Add performance testing scripts for DTypeFront #2005

Open
BenWibking wants to merge 7 commits into development from BenWibking/perftest-scripts
BenWibking (Collaborator) commented on Jun 26, 2026

Description

Adds performance-testing scripts for the DTypeFront problem with varying integrators.

These are the results printed by the included table script on an H200:

| Method  | FoM us/zone | Mupdate/s | TP wall s | Chem s | Chem % | Rad-noODE % | Hydro % | Hydro-only x | RadSub/HydroStep | Subcyc |
|---------|------------:|----------:|----------:|-------:|-------:|------------:|--------:|-------------:|-----------------:|-------:|
| ROS2S   |      0.0852 |     11.73 |     18.23 |   6.12 |  33.57 |       44.43 |   15.50 |         6.45 |             0.50 |   9.99 |
| Rodas3P |      0.0925 |     10.81 |     19.55 |   7.54 |  38.54 |       41.70 |   14.46 |         6.92 |             0.56 |   9.99 |
| Rodas4P |      0.1139 |      8.78 |     23.91 |  11.45 |  47.89 |       35.84 |   11.90 |         8.40 |             0.70 |   9.99 |
| VODE    |      0.1121 |      8.92 |     23.69 |  10.74 |  45.36 |       37.51 |   12.12 |         8.25 |             0.68 |   9.99 |
| Rodas5P |      0.1441 |      6.94 |     30.10 |  17.41 |  57.85 |       28.80 |    9.62 |        10.40 |             0.90 |   9.99 |

On Frontier, I get:

| Method  | FoM us/zone | Mupdate/s | TP wall s | Chem s | Chem % | Rad-noODE % | Hydro % | Hydro-only x | RadSub/HydroStep | Subcyc |
|---------|------------:|----------:|----------:|-------:|-------:|------------:|--------:|-------------:|-----------------:|-------:|
| ROS2S   |      0.2447 |      4.09 |     49.69 |  27.93 |  56.21 |       31.42 |   10.92 |         9.16 |             0.80 |   9.99 |
| Rodas3P |      0.2658 |      3.76 |     53.97 |  32.18 |  59.63 |       28.96 |   10.10 |         9.90 |             0.88 |   9.99 |
| Rodas4P |      0.3328 |      3.00 |     67.57 |  45.77 |  67.74 |       23.15 |    8.04 |        12.44 |             1.13 |   9.99 |
| VODE    |      0.3360 |      2.98 |     68.21 |  46.43 |  68.07 |       22.90 |    7.97 |        12.55 |             1.14 |   9.99 |
| Rodas5P |      0.3837 |      2.61 |     77.89 |  56.12 |  72.06 |       20.03 |    6.99 |        14.31 |             1.32 |   9.99 |
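Several columns in both tables appear to be derived from the others. The sketch below recomputes three of them for the ROS2S row on H200. The relations used here (Mupdate/s as the reciprocal of the FoM, Chem % as Chem s over TP wall s, and Hydro-only x as 100 over Hydro %) are inferred from the printed numbers and are an assumption, not taken from the script itself; the function name `derived_columns` is hypothetical.

```python
def derived_columns(fom_us_per_zone, tp_wall_s, chem_s, hydro_pct):
    """Recompute derived columns under the assumed relations."""
    return {
        # 1 zone per microsecond equals 1e6 zone updates per second.
        "mupdate_per_s": 1.0 / fom_us_per_zone,
        # Chemistry share of the total profiled wall time.
        "chem_pct": 100.0 * chem_s / tp_wall_s,
        # Assumed: speedup if only the hydro share of the run remained.
        "hydro_only_x": 100.0 / hydro_pct,
    }


# Inputs copied from the ROS2S row of the H200 table.
ros2s_h200 = derived_columns(0.0852, 18.23, 6.12, 15.50)
for name, value in ros2s_h200.items():
    print(f"{name}: {value:.2f}")
```

Small differences in the last digit relative to the table are expected, since the table was presumably computed from unrounded values.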

Related issues

N/A

Checklist

Before this pull request can be reviewed, all of these tasks should be completed. Denote completed tasks with an x inside the square brackets [ ] in the Markdown source below:

  • I have added a description (see above).
  • I have added a link to any related issues (if applicable; see above).
  • I have read the Contributing Guide.
  • I have added tests for any new physics that this PR adds to the code.
  • (For quokka-astro org members) I have manually triggered the GPU tests with the magic comment /azp run.

BenWibking changed the title from "add performance scripts for DTypeFront" to "add performance testing scripts for DTypeFront" on Jun 26, 2026

gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request introduces two new scripts: a Python script (dtypefront_perf_table.py) to parse benchmark logs and generate a performance table, and a Slurm script (dtypefront_gpu_sweep.submit) to run a GPU sweep across different microphysics integrators. Feedback on the Python script highlights potential regex matching failures due to leading whitespace in nested timers, a possible division-by-zero error when subcycles are zero, and compatibility issues with zip(..., strict=True) on older Python versions. Additionally, for the Slurm script, it is recommended to remove an unused parameter from the execution command.
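The three Python concerns raised in this summary can be illustrated with a minimal sketch. It is not the code in `dtypefront_perf_table.py`: the timer-line layout (optional indentation, a name, then a time in seconds) and the helper names `parse_timer`, `safe_ratio`, and `zip_strict` are assumptions made for illustration. The only fact relied on is that `zip(..., strict=True)` requires Python 3.10 or newer.

```python
import re

# Hypothetical layout: optional leading whitespace, a timer name, then seconds.
# The leading \s* lets indented (nested) timers match as well.
TIMER_RE = re.compile(
    r"^\s*(?P<name>\S+)\s+(?P<seconds>[-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)\s*$"
)


def parse_timer(line):
    """Return (name, seconds) for a timer line, or None if it does not match."""
    match = TIMER_RE.match(line)
    if match is None:
        return None
    return match.group("name"), float(match.group("seconds"))


def safe_ratio(numerator, denominator):
    """Divide, returning NaN instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def zip_strict(first, second):
    """Length-checked zip that also works before Python 3.10."""
    first, second = list(first), list(second)
    if len(first) != len(second):
        raise ValueError("zip_strict: arguments have different lengths")
    return zip(first, second)


print(parse_timer("    chem_solve   6.12"))
print(safe_ratio(10.0, 0))
```

Returning NaN for a zero subcycle count keeps the table printable; whether that or a hard error is preferable depends on how the script is used.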


Comment thread scripts/python/dtypefront_perf_table.py Outdated
Comment thread scripts/python/dtypefront_perf_table.py
Comment thread scripts/python/dtypefront_perf_table.py Outdated
Comment thread scripts/slurm/dtypefront_gpu_sweep.submit Outdated
BenWibking changed the title from "add performance testing scripts for DTypeFront" to "Add performance testing scripts for DTypeFront" on Jun 26, 2026
BenWibking marked this pull request as ready for review on Jun 26, 2026 at 20:45
dosubot (Bot) added the size:L label (this PR changes 100-499 lines, ignoring generated files) on Jun 26, 2026
BenWibking enabled auto-merge on Jun 26, 2026 at 20:45

chatgpt-codex-connector (Bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7acf74359c

local build_dir="$1"
local integrator="$2"

cmake -S "${ROOT_DIR}" -B "${build_dir}" -DDTypeFront_INTEGRATOR="${integrator}" -DAMReX_GPU_BACKEND=CUDA -DAMReX_GPU_ARCH=9.0 -DAMReX_SPACEDIM=3

P2: Build the CUDA sweep in Release mode

For a fresh build directory this cmake invocation leaves CMAKE_BUILD_TYPE unset; the top-level project does not provide a Release default, while the Frontier sweep explicitly passes -DCMAKE_BUILD_TYPE=Release. Since this script is intended to produce performance numbers, the generic CUDA/H200 sweep can benchmark unoptimized binaries and produce misleading or much slower results unless the user happens to reuse a preconfigured Release directory.
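One way to guard against benchmarking an unoptimized binary is to inspect the build directory before a run. The sketch below reads the `CMAKE_BUILD_TYPE` entry from `CMakeCache.txt`, which CMake writes into every configured build directory. The helper name `cached_build_type` and the idea of calling it from the table script are assumptions, not part of this PR.

```python
from pathlib import Path


def cached_build_type(build_dir):
    """Return the CMAKE_BUILD_TYPE recorded in CMakeCache.txt, or None.

    An empty string means the entry exists but no build type was set.
    """
    cache = Path(build_dir) / "CMakeCache.txt"
    if not cache.is_file():
        return None
    for line in cache.read_text().splitlines():
        # Cache entries have the form NAME:TYPE=VALUE.
        if line.startswith("CMAKE_BUILD_TYPE:"):
            return line.split("=", 1)[1].strip()
    return None


def is_release_build(build_dir):
    """True only when the cached build type is exactly 'Release'."""
    return cached_build_type(build_dir) == "Release"
```

A sweep or table script could call `is_release_build(build_dir)` and warn when it returns `False`, so that results from a default (unset) build type are not mistaken for optimized numbers.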
